194 research outputs found
Logic-based assessment of the compatibility of UMLS ontology sources
Background: The UMLS Metathesaurus (UMLS-Meta) is currently the most comprehensive effort for integrating independently-developed medical thesauri and ontologies. UMLS-Meta is being used in many applications, including PubMed and ClinicalTrials.gov. The integration of new sources combines automatic techniques, expert assessment, and auditing protocols. The automatic techniques currently in use, however, are mostly based on lexical algorithms and often disregard the semantics of the sources being integrated. Results: In this paper, we argue that UMLS-Meta’s current design and auditing methodologies could be significantly enhanced by taking into account the logic-based semantics of the ontology sources. We provide empirical evidence suggesting that UMLS-Meta in its 2009AA version contains a significant number of errors; these errors become immediately apparent if the rich semantics of the ontology sources is taken into account, manifesting themselves as unintended logical consequences that follow from the ontology sources together with the information in UMLS-Meta. We then propose general principles and specific logic-based techniques to effectively detect and repair such errors. Conclusions: Our results suggest that the methodologies employed in the design of UMLS-Meta are not only very costly in terms of human effort, but also error-prone. The techniques presented here can be useful for both reducing human effort in the design and maintenance of UMLS-Meta and improving the quality of its contents
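The idea that mapping errors surface as unintended logical consequences can be sketched in a few lines. This is a simplified illustration, not the technique from the paper: it handles only subclass and disjointness axioms, treats each mapping as an equivalence, and flags classes that end up below two disjoint classes. The class names (`A:Rash`, `B:Disease`, and so on) and the function names are hypothetical; a real audit would use a description logic reasoner over the full ontologies.

```python
def superclasses(cls, subclass_of):
    """Return cls plus every class reachable through subclass-of edges."""
    seen, stack = {cls}, [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unsatisfiable_classes(subclass_of, disjoint_pairs, mappings):
    """Find classes that become subclasses of two disjoint classes
    once the equivalence mappings are merged with the source axioms."""
    merged = {c: set(parents) for c, parents in subclass_of.items()}
    for a, b in mappings:
        # An equivalence is modelled as subclass-of in both directions.
        merged.setdefault(a, set()).add(b)
        merged.setdefault(b, set()).add(a)
    classes = set(merged) | {p for ps in merged.values() for p in ps}
    flagged = set()
    for c in classes:
        ancestors = superclasses(c, merged)
        if any(x in ancestors and y in ancestors for x, y in disjoint_pairs):
            flagged.add(c)
    return flagged


# Hypothetical toy input: source A says a rash is a finding and that
# findings and diseases are disjoint; source B says a rash is a disease.
subclass_of = {"A:Rash": {"A:Finding"}, "B:Rash": {"B:Disease"}}
disjoint_pairs = [("A:Finding", "A:Disease")]
mappings = [("A:Rash", "B:Rash"), ("A:Disease", "B:Disease")]

conflicts = unsatisfiable_classes(subclass_of, disjoint_pairs, mappings)
print(sorted(conflicts))
```

Each source is consistent on its own in this toy input; the conflict only appears once the two mappings are added, which is the kind of error a purely lexical check would not notice.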
Language Model Analysis for Ontology Subsumption Inference
Pre-trained language models (LMs) have made significant advances in various
Natural Language Processing (NLP) domains, but it is unclear to what extent
they can infer formal semantics in ontologies, which are often used to
represent conceptual knowledge and serve as the schema of data graphs. To
investigate an LM's knowledge of ontologies, we propose OntoLAMA, a set of
inference-based probing tasks and datasets from ontology subsumption axioms
involving both atomic and complex concepts. We conduct extensive experiments on
ontologies of different domains and scales, and our results demonstrate that
LMs encode relatively less background knowledge of Subsumption Inference (SI)
than traditional Natural Language Inference (NLI) but can improve on SI
significantly when a small number of samples are given. We will open-source our
code and datasets
Describing images using qualitative models and description logics
Special Issue: Qualitative spatial and temporal reasoning: emerging applications, trends, and directions. Our approach describes any digital image qualitatively by detecting regions/objects inside it and describing their visual characteristics (shape and colour) and their spatial characteristics (orientation and topology) by means of qualitative models. The description obtained is translated into a description logic (DL) based ontology, which gives a formal and explicit meaning to the qualitative tags representing the visual features of the objects in the image and the spatial relations between them. For any image, our approach obtains a set of individuals that are classified using a DL reasoner according to the descriptions of our ontology
Machine Learning-Friendly Biomedical Datasets for Equivalence and Subsumption Ontology Matching
Ontology Matching (OM) plays an important role in many domains such as
bioinformatics and the Semantic Web, and its research is becoming increasingly
popular, especially with the application of machine learning (ML) techniques.
Although the Ontology Alignment Evaluation Initiative (OAEI) represents an
impressive effort for the systematic evaluation of OM systems, it still suffers
from several limitations including limited evaluation of subsumption mappings,
suboptimal reference mappings, and limited support for the evaluation of
ML-based systems. To tackle these limitations, we introduce five new biomedical
OM tasks involving ontologies extracted from Mondo and UMLS. Each task includes
both equivalence and subsumption matching; the quality of reference mappings is
ensured by human curation, ontology pruning, etc.; and a comprehensive
evaluation framework is proposed to measure OM performance from various
perspectives for both ML-based and non-ML-based OM systems. We report
evaluation results for OM systems of different types to demonstrate the usage
of these resources, all of which are publicly available as part of the new
BioML track at OAEI 2022. Comment: Accepted paper in the 21st International
Semantic Web Conference (ISWC-2022); DOI for Bio-ML Dataset: 10.5281/zenodo.651008
Working group report on Semantic Technologies in Collaborative Applications
T. Riechert, E. J. Ruiz, I. Cantador, M. Engler, D. T. Michaelides, M. Bortenschläger, and R. Tolksdorf, "Working group report on Semantic Technologies in Collaborative Applications", in WETICE '06. 15th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises, 2006, Manchester (United Kingdom), pp. 347-351. The 1st International Workshop on Semantic Technologies in Collaborative Applications (STICA 06) brought together researchers in the field of semantics-enabled collaboration. The presentations covered various aspects of the field and showed clear indications for future collaborations
XML-based approaches for the integration of heterogeneous bio-molecular data
Background: Today's public database infrastructure spans a very large collection of heterogeneous biological data, opening new opportunities for molecular biology, biomedical and bioinformatics research, but also raising new problems for their integration and computational processing. Results: In this paper we survey the most interesting and novel approaches for the representation, integration and management of different kinds of biological data by exploiting XML and the related recommendations and approaches. Moreover, we present new cutting-edge approaches for the appropriate management of heterogeneous biological data represented through XML. Conclusion: XML has succeeded in the integration of heterogeneous biomolecular information, and has established itself as the syntactic glue for biological data sources. Nevertheless, a large variety of XML-based data formats have been proposed, which makes the effective integration of bioinformatics data schemes difficult. The adoption of a few semantically rich standard formats is urgently needed to achieve a seamless integration of the current biological resources.
OM-2017: Proceedings of the Twelfth International Workshop on Ontology Matching
Ontology matching is a key interoperability enabler for the semantic web, as well as a useful tactic in some classical data integration tasks dealing with the semantic heterogeneity problem. It takes ontologies as input and determines as output an alignment, that is, a set of correspondences between the semantically related entities of those ontologies. These correspondences can be used for various tasks, such as ontology merging, data translation, query answering or navigation on the web of data. Thus, matching ontologies enables the knowledge and data expressed with the matched ontologies to interoperate
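The input/output contract described above (ontologies in, an alignment out) can be sketched as follows. This is a minimal illustration under stated assumptions, not a method from the proceedings: each ontology is reduced to a hypothetical mapping from entity identifier to label, and the matcher only compares normalised labels. The names `Correspondence`, `normalise` and `match_by_label`, and the sample entities, are invented for the example.

```python
from typing import NamedTuple


class Correspondence(NamedTuple):
    source: str        # entity identifier in the first ontology
    target: str        # entity identifier in the second ontology
    relation: str      # "=" denotes equivalence
    confidence: float  # 1.0 for an exact normalised-label match


def normalise(label: str) -> str:
    """Lower-case a label and unify hyphens, underscores and spacing."""
    cleaned = label.lower().replace("_", " ").replace("-", " ")
    return " ".join(cleaned.split())


def match_by_label(onto_a: dict[str, str],
                   onto_b: dict[str, str]) -> set[Correspondence]:
    """Return an alignment: the set of correspondences between entities
    whose normalised labels are identical."""
    index: dict[str, list[str]] = {}
    for entity, label in onto_b.items():
        index.setdefault(normalise(label), []).append(entity)
    alignment = set()
    for entity, label in onto_a.items():
        for other in index.get(normalise(label), []):
            alignment.add(Correspondence(entity, other, "=", 1.0))
    return alignment


# Hypothetical toy ontologies: identifier -> label.
onto_a = {"a:1": "Heart Disease", "a:2": "Lung"}
onto_b = {"b:9": "heart-disease", "b:8": "Kidney"}
alignment = match_by_label(onto_a, onto_b)
print(alignment)
```

Keeping the relation and a confidence value on each correspondence means that downstream tasks such as merging or query answering can filter the alignment by relation type or by threshold.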